Towards More Efficient DNN-Based Speech Enhancement Using Quantized Correlation Mask
Authors
Abstract
Many studies on deep learning-based speech enhancement (SE) that use the computational auditory scene analysis method typically employ the ideal binary mask or the ideal ratio mask to reconstruct the enhanced speech signal. However, many SE applications in real scenarios demand a desirable balance between denoising capability and computational cost. In this study, first, an improvement over the ideal ratio mask that attains superior SE performance is proposed by introducing an efficient adaptive correlation-based factor for adjusting the ratio mask. The proposed method exploits the correlation coefficients among the noisy speech, the noise, and the clean speech to effectively re-distribute the power ratio of the speech and noise during the ratio mask construction phase. Second, to make the supervised SE system more computationally efficient, quantization techniques are considered to reduce the number of bits needed to represent floating-point numbers, leading to a more compact SE model. The proposed quantized correlation mask is utilized in conjunction with a 4-layer deep neural network (DNN-QCM) comprising dropout regulation, pre-training, and noise-aware training to derive a robust, high-order mapping for enhancement and to improve generalization capability in unseen conditions. Results show that the quantized correlation mask outperforms the conventional ratio mask representation and the other SE algorithms used for comparison. When compared to a DNN with the ideal ratio mask as its learning target, the DNN-QCM provided an improvement of approximately 6.5% in the short-time objective intelligibility score and 11.0% in the perceptual evaluation of speech quality score. The introduction of the quantization method can reduce the neural network weights from a 32-bit to a 5-bit representation while effectively suppressing stationary and non-stationary noise. Timing analyses also show that, with the techniques incorporated in the proposed DNN-QCM system to increase its compactness, the training and inference time can be reduced by 15.7% and 10.5%, respectively.
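The two building blocks named in the abstract, a ratio mask and a reduced-bit weight representation, can be illustrated with a minimal sketch. This is not the authors' method: the abstract does not give the correlation-based adjustment factor or the exact quantization scheme, so the code below assumes the conventional square-root ideal ratio mask as the baseline and a plain uniform (evenly spaced) 5-bit quantizer. The function names `ideal_ratio_mask`, `quantize_uniform`, and `dequantize` are hypothetical.

```python
# Illustrative sketch only. The correlation-based adjustment from the paper
# is NOT reproduced here because the abstract does not specify it.

def ideal_ratio_mask(speech_power, noise_power):
    """Conventional ideal ratio mask per time-frequency bin:
    sqrt(S / (S + N)), where S and N are speech and noise power."""
    mask = []
    for s, n in zip(speech_power, noise_power):
        total = s + n
        mask.append((s / total) ** 0.5 if total > 0 else 0.0)
    return mask


def quantize_uniform(weights, bits=5):
    """Assumed scheme: map each weight to the nearest of 2**bits evenly
    spaced levels spanning [min(weights), max(weights)].
    Returns integer codes plus the offset and step needed to decode."""
    levels = 2 ** bits
    lo, hi = min(weights), max(weights)
    if hi == lo:
        # All weights identical: one level is enough.
        return [0] * len(weights), lo, 0.0
    step = (hi - lo) / (levels - 1)
    codes = [min(levels - 1, max(0, round((w - lo) / step))) for w in weights]
    return codes, lo, step


def dequantize(codes, lo, step):
    """Recover approximate floating-point weights from integer codes."""
    return [lo + c * step for c in codes]


if __name__ == "__main__":
    weights = [-1.0, -0.5, 0.0, 0.25, 1.0]
    codes, lo, step = quantize_uniform(weights, bits=5)
    print("codes:", codes)
    print("restored:", dequantize(codes, lo, step))
    print("mask:", ideal_ratio_mask([1.0, 0.0], [1.0, 1.0]))
```

Under this assumed uniform scheme each weight is stored as a 5-bit code instead of a 32-bit float, and the reconstruction error per weight is bounded by half a quantization step.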
Similar resources
Integration of DNN based speech enhancement and ASR
Speech enhancement employing Deep Neural Networks (DNNs) is gaining strength as a data-driven alternative to classical Minimum Mean Square Error (MMSE) enhancement approaches. In the past, Observation Uncertainty approaches to integrate MMSE speech enhancement with Automatic Speech Recognition (ASR) have yielded good results as a lightweight alternative for robust ASR. In this paper we thus exp...
Towards minimum perceptual error training for DNN-based speech synthesis
We propose to use a perceptually-oriented domain to improve the quality of text-to-speech generated by deep neural networks (DNNs). We train a DNN that predicts the parameters required for speech reconstruction but whose cost function is calculated in another domain. In this paper, to represent this perceptual domain we extract an approximated version of the SpectroTemporal Excitation Pattern t...
Speech Enhancement using Adaptive Data-Based Dictionary Learning
In this paper, a speech enhancement method based on sparse representation of data frames has been presented. Speech enhancement is one of the most applicable areas in different signal processing fields. The objective of a speech enhancement system is improvement of either intelligibility or quality of the speech signals. This process is carried out using the speech signal processing techniques ...
DNN-Based Feature Enhancement Using Joint Training Framework for Robust Multichannel Speech Recognition
Ever since the deep neural network (DNN) appeared in the speech signal processing society, the recognition performance of automatic speech recognition (ASR) has been greatly improved. Due to this achievement, the demands on various applications in distant-talking environment also have been increased. However, ASR performance in such environments is still far from that in close-talking environme...
Student-Teacher Learning for BLSTM Mask-based Speech Enhancement
Spectral mask estimation using bidirectional long short-term memory (BLSTM) neural networks has been widely used in various speech enhancement applications, and it has achieved great success when it is applied to multichannel enhancement techniques with a mask-based beamformer. However, when these masks are used for single channel speech enhancement they severely distort the speech signal and m...
Journal
Journal title: IEEE Access
Year: 2021
ISSN: 2169-3536
DOI: https://doi.org/10.1109/access.2021.3056711